Variational Stochastic Games
The Control as Inference (CAI) framework has successfully transformed single-agent reinforcement learning (RL) by reframing control tasks as probabilistic inference problems. However, the extension of CAI to multi-agent, general-sum stochastic games (SGs) remains underexplored, particularly in decentralized settings where agents operate independently without centralized coordination. In this paper, we propose a novel variational inference framework tailored to decentralized multi-agent systems. Our framework addresses the challenges posed by non-stationarity and unaligned agent objectives, proving that the resulting policies form an $\epsilon$-Nash equilibrium. Additionally, we demonstrate theoretical convergence guarantees for the proposed decentralized algorithms. Leveraging this framework, we instantiate multiple algorithms to solve for Nash equilibrium, mean-field Nash equilibrium, and correlated equilibrium, with rigorous theoretical convergence analysis.
A Single Online Agent Can Efficiently Learn Mean Field Games
Zhang, Chenyu, Chen, Xu, Di, Xuan
Mean field games (MFGs) are a promising framework for modeling the behavior of large-population systems. However, solving MFGs can be challenging due to the coupling of forward population evolution and backward agent dynamics. Typically, obtaining mean field Nash equilibria (MFNE) involves an iterative approach where the forward and backward processes are solved alternately, known as fixed-point iteration (FPI). This method requires fully observed population propagation and agent dynamics over the entire spatial domain, which could be impractical in some real-world scenarios. To overcome this limitation, this paper introduces a novel online single-agent model-free learning scheme, which enables a single agent to learn MFNE using online samples, without prior knowledge of the state-action space, reward function, or transition dynamics. Specifically, the agent updates its policy through the value function (Q), while simultaneously evaluating the mean field state (M), using the same batch of observations. We develop two variants of this learning scheme: off-policy and on-policy QM iteration. We prove that they efficiently approximate FPI, and a sample complexity guarantee is provided. The efficacy of our methods is confirmed by numerical experiments.
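The fixed-point iteration (FPI) described above can be sketched on a small finite game. This is a minimal illustration under stated assumptions, not the paper's QM iteration: the toy game (`make_toy_game`), the softmax backward pass, the damping factor, and every function name are hypothetical choices made for the example. It requires the full transition tensor and reward function, which is exactly the knowledge the online scheme in the abstract is designed to do without.

```python
import numpy as np


def make_toy_game():
    """Hypothetical 3-state game: action a tries to move the agent to state a."""
    S = A = 3
    P = np.zeros((A, S, S))  # P[a, s, s'] = transition probability
    for a in range(A):
        for s in range(S):
            P[a, s, s] += 0.1  # stay put with probability 0.1
            P[a, s, a] += 0.9  # reach the target state with probability 0.9

    def reward(m):
        # Crowd aversion: occupying a crowded state is costly for every action.
        return np.repeat(-m[:, None], A, axis=1)  # shape (S, A)

    m0 = np.array([0.6, 0.3, 0.1])  # assumed initial population distribution
    return m0, P, reward


def best_response(m_flow, P, reward, temperature=0.1):
    """Backward pass: softmax policy against a fixed mean-field flow."""
    T, S = m_flow.shape
    A = P.shape[0]
    V = np.zeros(S)
    policy = np.zeros((T, S, A))
    for t in reversed(range(T)):
        # Q[s, a] = r(s, a, m_t) + sum_n P[a, s, n] * V[n]
        Q = reward(m_flow[t]) + np.einsum("asn,n->sa", P, V)
        logits = (Q - Q.max(axis=1, keepdims=True)) / temperature
        pi = np.exp(logits)
        pi /= pi.sum(axis=1, keepdims=True)
        policy[t] = pi
        V = (pi * Q).sum(axis=1)
    return policy


def propagate(m0, policy, P):
    """Forward pass: population flow induced by a policy."""
    T, S, _ = policy.shape
    m_flow = np.zeros((T, S))
    m_flow[0] = m0
    for t in range(T - 1):
        # m_{t+1}[n] = sum_{s, a} m_t[s] * pi_t[s, a] * P[a, s, n]
        m_flow[t + 1] = np.einsum("s,sa,asn->n", m_flow[t], policy[t], P)
    return m_flow


def fixed_point_iteration(m0, P, reward, T=10, iters=200, damping=0.5):
    """Alternate backward and forward passes, mixing flows for stability."""
    m_flow = np.tile(m0, (T, 1))
    for _ in range(iters):
        policy = best_response(m_flow, P, reward)
        new_flow = propagate(m0, policy, P)
        m_flow = (1 - damping) * m_flow + damping * new_flow
    return policy, m_flow
```

Calling `fixed_point_iteration(*make_toy_game())` returns a policy array of shape `(T, S, A)` and a flow of shape `(T, S)`. The damped update is a common stabilizer rather than part of plain FPI, and convergence is not guaranteed for arbitrary games.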
Correlated Mean Field Imitation Learning
Zhao, Zhiyu, Yang, Ning, Yan, Xue, Zhang, Haifeng, Wang, Jun, Yang, Yaodong
We investigate multi-agent imitation learning (IL) within the framework of mean field games (MFGs), considering the presence of time-varying correlated signals. Existing MFG IL algorithms assume demonstrations are sampled from Mean Field Nash Equilibria (MFNE), limiting their adaptability to real-world scenarios. For example, in the traffic network equilibrium influenced by public routing recommendations, recommendations introduce time-varying correlated signals into the game, not captured by MFNE and other existing correlated equilibrium concepts. To address this gap, we propose Adaptive Mean Field Correlated Equilibrium (AMFCE), a general equilibrium incorporating time-varying correlated signals. We establish the existence of AMFCE under mild conditions and prove that MFNE is a subclass of AMFCE. We further propose Correlated Mean Field Imitation Learning (CMFIL), a novel IL framework designed to recover the AMFCE, accompanied by a theoretical guarantee on the quality of the recovered policy. Experimental results, including a real-world traffic flow prediction problem, demonstrate the superiority of CMFIL over state-of-the-art IL baselines, highlighting the potential of CMFIL in understanding large population behavior under correlated signals.
Recent Advances in Modeling and Control of Epidemics using a Mean Field Approach
Roy, Amal, Singh, Chandramani, Narahari, Y.
Modeling and control of epidemics such as the novel coronavirus have assumed paramount importance at a global level. A natural and powerful dynamical modeling framework to use in this context is a continuous time Markov decision process (CTMDP) that encompasses classical compartmental paradigms such as the Susceptible-Infected-Recovered (SIR) model. The challenges with CTMDP-based models motivate the need for a more efficient approach, and the mean field approach offers an effective alternative. The mean field approach computes the collective behavior of a dynamical system comprising numerous interacting nodes (where nodes represent individuals in the population). This paper (a) presents an overview of the mean field approach to epidemic modeling and control and (b) provides a state-of-the-art update on recent advances on this topic. Our discussion proceeds along two specific threads. The first thread assumes that the individual nodes faithfully follow a socially optimal control policy prescribed by a regulatory authority. The second thread allows the individual nodes to exhibit independent, strategic behavior. In this case, the strategic interaction is modeled as a mean field game and the control is based on the associated mean field Nash equilibria. We start with a discussion of modeling of epidemics using an extended compartmental model, SIVR, and provide an illustrative example. We next review the literature that uses a mean field approach to the optimal control of epidemics, dealing with how a regulatory authority may optimally contain epidemic spread in a population. Following this, we provide an update on the literature on the use of the mean field game-based approach in the study of epidemic spread and control. We conclude the paper with relevant future research directions.
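As a small illustration of the compartmental models mentioned above, the deterministic SIR equations can be integrated with forward Euler steps. This is a sketch of the classical SIR model only, not of the SIVR extension or any control policy from the paper; the function name `simulate_sir`, the step size, and the rates in the usage note are assumptions chosen for the example. The ODE tracks population fractions, which is the sense in which it describes collective rather than individual behavior.

```python
import numpy as np


def simulate_sir(beta, gamma, s0, i0, r0, dt=0.01, steps=10_000):
    """Forward-Euler integration of the SIR equations on population fractions.

    ds/dt = -beta * s * i
    di/dt =  beta * s * i - gamma * i
    dr/dt =  gamma * i
    """
    traj = np.empty((steps + 1, 3))
    traj[0] = (s0, i0, r0)
    for k in range(steps):
        s, i, r = traj[k]
        new_infections = beta * s * i  # flow from S to I
        recoveries = gamma * i         # flow from I to R
        traj[k + 1] = (
            s - dt * new_infections,
            i + dt * (new_infections - recoveries),
            r + dt * recoveries,
        )
    return traj
```

For instance, `simulate_sir(0.3, 0.1, 0.99, 0.01, 0.0)` uses a hypothetical infection rate of 0.3 and recovery rate of 0.1; the three columns of the result are the susceptible, infected, and recovered fractions over time.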
Learning Correlated Equilibria in Mean-Field Games
Muller, Paul, Elie, Romuald, Rowland, Mark, Lauriere, Mathieu, Perolat, Julien, Perrin, Sarah, Geist, Matthieu, Piliouras, Georgios, Pietquin, Olivier, Tuyls, Karl
The designs of many large-scale systems today, from traffic routing environments to smart grids, rely on game-theoretic equilibrium concepts. However, as the size of an $N$-player game typically grows exponentially with $N$, standard game-theoretic analysis becomes effectively infeasible beyond a small number of players. Recent approaches have circumvented this limitation by instead considering Mean-Field games, an approximation of anonymous $N$-player games, where the number of players is infinite and the population's state distribution, instead of every individual player's state, is the object of interest. The practical computability of Mean-Field Nash equilibria, the most studied Mean-Field equilibrium to date, however, typically depends on beneficial non-generic structural properties such as monotonicity or contraction properties, which are required for known algorithms to converge. In this work, we provide an alternative route for studying Mean-Field games, by developing the concepts of Mean-Field correlated and coarse-correlated equilibria. We show that they can be efficiently learnt in all games, without requiring any additional assumption on the structure of the game, using three classical algorithms. Furthermore, we establish correspondences between our notions and those already present in the literature, derive optimality bounds for the transition from Mean-Field to $N$-player games, and empirically demonstrate the convergence of these algorithms on simple games.